-
Notifications
You must be signed in to change notification settings - Fork 256
[YUNIKORN-3140] Quota change through preemption config changes #1034
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1034 +/- ##
==========================================
+ Coverage 80.91% 81.44% +0.52%
==========================================
Files 99 101 +2
Lines 12882 13308 +426
==========================================
+ Hits 10424 10839 +415
- Misses 2201 2209 +8
- Partials 257 260 +3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make the config change and nothing else in this PR. Leave out the Timer inclusion because that needs to be discussed further.
I definitely don't like having a bunch of timers running & firing, doing things in parallel. I understand that this is mentioned in the design docs, but I think we should maintain our "timer" as an abstract concept - eg. calculating when the preemption should occur by refreshing a value inside the Queue object. We call ClusterContext.schedule() in every 100ms anyway, so this should not be a big deal.
Look at the link I sent about Timer.Reset(). It's much safer if we maintain delays/deadlines on our own.
cc @wilfred-s
Ok, we can discuss in a separate PR and keep only config changes here. Quota change through preemption is a one off event does not occur often, so not really sure whether clubbing this with critical core scheduling path is a reasonable thing to do, and need to ensure doing this does not impact other existing process in the cycle and vice-versa. |
|
@manirajv06 could you rebase the changes? |
edefe75 to
82d406f
Compare
Rebased the changes. Please check. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small nits
|
e2e test failures are not related to this change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
What is this PR for?
Introduced a config "preemption.delay" to let admins to set the graceful period before triggering the preemption in case of any quota decrease.
What type of PR is it?
Todos
What is the Jira issue?
https://issues.apache.org/jira/browse/YUNIKORN-3140
How should this be tested?
Screenshots (if appropriate)
Questions: